Abstract

In many compressed sensing applications, linear programming (LP) has been used to reconstruct
a sparse signal. When the observation is noisy, the LP formulation is extended to allow an inequality
constraint, and the solution depends on a parameter $\delta$ related to the observation noise level.
Recently, some researchers have also considered quadratic programming (QP) for compressed sensing
signal reconstruction; the solution in this case depends on a Lagrange multiplier $\beta$. In this
work, we investigate the relation between $\delta$ and $\beta$ and derive an upper and a lower bound on
$\beta$ in terms of $\delta$. For a given $\delta$, these bounds can be used to approximate $\beta$. Since
$\delta$ is a physically meaningful quantity that is easy to determine for an application, while there is
in general no easy way to determine $\beta$, our results can be used to set $\beta$ when the QP is used
for compressed sensing. Our results and experimental verification also provide some insight into the
solutions generated by compressed sensing.

1. Introduction 

In many compressed sensing applications, signal reconstruction is carried out by solving a linear
programming (LP) problem [1]. When the observation is noiseless, the LP problem is:

$$\text{(LP)}: \quad \min_u \|u\|_1 \quad \text{subject to} \quad Au = b \tag{1}$$

where $u$ is the signal to be reconstructed (an $n$-dimensional vector, sufficiently sparse), $A$ is an
$m \times n$ sampling matrix with $m \ll n$, $b$ is the observation (an $m$-dimensional vector), and the
minimization is with respect to $u$. When the observation is noisy (with bounded noise), the LP
formulation is extended to

$$\text{(LPn)}: \quad \min_u \|u\|_1 \quad \text{subject to} \quad \|Au - b\|_2^2 \le \delta^2 \tag{2}$$

where $\delta^2$ is a bound on the noise power, or noise level. Although strictly speaking this is no
longer a linear programming problem (it is still a convex optimization problem), because of its relation
to eqn (1) and for the sake of simplicity, we will still refer to it as a part of the LP problem (LPn).
In some work, a quadratic programming (QP) problem, with

$$\text{(QP)}: \quad \min_u \tfrac{1}{2}\|Au - b\|_2^2 + \beta\|u\|_1 \tag{3}$$

has been considered for compressed sensing. For example, Fuchs [2], [3] and Tropp [1] used the
QP formulation to study the theoretical properties of the solutions of various compressed sensing
problems. Similarly, Chen et al. (e.g., see [5]) used a QP-type formulation in several papers for
compressed sensing based 3D CT (computed tomography) [in their case, the $\ell_1$ norm is replaced by
total variation (TV)]. Finally, the QP problem is closely related to (in fact equivalent to) the Lasso
procedure [6], which is widely used in statistics, pattern recognition, and data mining.
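To make the QP objective of eqn (3) concrete, the following is a minimal sketch of one way to minimize it numerically. It uses proximal gradient descent (ISTA), which is a swapped-in illustrative technique and not the solver used in this paper; the function names `soft_threshold` and `solve_qp_ista`, the fixed iteration count, and the step-size choice are all assumptions made for the example.

```python
import numpy as np

def soft_threshold(v, tau):
    """Component-wise soft-thresholding, the proximal map of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def solve_qp_ista(A, b, beta, n_iter=5000):
    """Minimize 0.5 * ||A u - b||_2^2 + beta * ||u||_1 by proximal gradient (ISTA).

    Hypothetical helper for illustration only; a fixed number of iterations
    is used instead of a convergence test.
    """
    # Step size 1/L, where L = ||A||_2^2 is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ u - b)                      # gradient of the quadratic term
        u = soft_threshold(u - step * grad, step * beta)
    return u
```

When $A$ is the identity, the minimizer is simply the soft-thresholded observation, which gives a quick sanity check on the role of $\beta$: larger values shrink more components of the solution to zero.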

Given the LP and QP formulations, a natural question is: when are they the same, i.e., when do they
produce the same results? When the observation is noiseless, Fuchs showed that the QP becomes the same
as the LP when $\beta \to 0^+$ (i.e., from the right). When the observation is noisy, Fuchs [3] pointed out
that for a given noise bound $\delta^2$ in the LPn, there exists a $\beta > 0$ for the QP such that the resulting
QP is the same as the LPn, i.e., they produce the same solutions; in fact, this can be established
through the theory of duality [7]. However, Fuchs also mentioned that it is difficult to find an explicit
(e.g., an analytic) relation between this $\beta$ and the $\delta^2$ in LPn. This means that when a QP algorithm
is used in practice for compressed sensing, such as in the previously mentioned 3D CT and Lasso
applications, it may not actually be performing compressed sensing (as defined by the LPn) since,
in these applications, it is unclear how to find the "right" $\beta$. Indeed, in practice $\beta$ is often selected
experimentally. However, as illustrated in Fig. 1 (figures are all at the end of the paper), for a given
compressed sensing problem with a given noise level ($\delta^2$), the solution from the QP depends
on the parameter $\beta$ and, most of the time, it is not the same as that from the LPn (this is illustrated
through the QP's reconstruction error for various $\beta$ values).

In this paper, we attempt to find an analytic relation between $\beta$ and $\delta^2$ (or equivalently, between
$\beta$ and $\delta$). Such a relation is useful in three respects. First, it may allow us to gain more insight
into the relation between the LP and the QP problems, and thereby more insight into the nature of
the solution of the compressed sensing problem. Second, if in practice we want to use the QP or
Lasso for some reason (e.g., familiarity, easier implementation, or faster speed) as an algorithm for
compressed sensing, we can obtain or estimate the appropriate $\beta$ from $\delta$, which, as the noise level,
is a physical parameter and usually readily available. Finally, many signal/image processing and
computer vision problems are solved by a Bayesian formulation where an energy function related
to the posterior probability distribution is minimized. Usually, the energy function is a sum of two
terms: the first is related to the observation model and is similar to, or the same as, the first term in
the QP [see eqn (3)], and the second is related to prior constraints and is similar to the second term in the
QP. The two terms are "balanced" by a parameter $\beta$ just like that in the QP formulation. In many
such Bayesian applications, selecting the value of $\beta$ is a problem and there is no analytic/theoretical
guidance. Our work sheds light on this problem and could potentially be used to find a solution.

The rest of the paper is organized as follows. In Section 2, we derive some analytic relations
between $\beta$ and $\delta$, and in Section 3, we verify and illustrate some of these relations experimentally.
Finally, in Section 4, we provide conclusions.

2. Analytic Results 

In this section, we derive two relations between $\beta$ and $\delta$, one in the form of inequalities and
the other in the form of an equality.

2.1. Inequality Relations

Suppose that for a noise level or bound $\delta^2$, the LPn problem of eqn (2) has a sparse solution. Then, as
described in Section 1, there is a $\beta > 0$ such that, for this $\beta$, the solution of the QP problem of eqn (3)
is the same as that of the LPn. Let $\hat{x}$ be this "common" solution and suppose it has $k$ non-zero
components. According to Fuchs [2], [3], this solution must satisfy the following condition (which can
also be viewed as part of the KKT conditions [7]):

$$\bar{A}^T(b - \bar{A}\bar{x}) = \beta\,\mathrm{sgn}(\bar{x}) \tag{4}$$

where $\bar{x}$ is the "reduced solution vector," made up of the non-zero components of $\hat{x}$, $\bar{A}$ is an
$m \times k$ matrix made up of the columns of matrix $A$ corresponding to $\bar{x}$, and $\mathrm{sgn}(\cdot)$ is the usual
sign function, defined for a scalar $t$ as

$$\mathrm{sgn}(t) = \begin{cases} -1, & \text{if } t < 0 \\ +1, & \text{if } t > 0 \end{cases} \tag{5}$$

while for a vector $v$, $\mathrm{sgn}(v)$ is applied component by component, leading to a vector of $+1$s and $-1$s.
Now, taking the squared $\ell_2$ norm of both sides of (4), we have

$$(b - \bar{A}\bar{x})^T(\bar{A}\bar{A}^T)(b - \bar{A}\bar{x}) = \beta^2\|\mathrm{sgn}(\bar{x})\|_2^2 = \beta^2 k \tag{6}$$

where we used the fact that

$$\|\mathrm{sgn}(\bar{x})\|_2^2 = (\sqrt{k})^2 = k \tag{7}$$

Note that $\bar{A}\bar{A}^T$ is an $m \times m$ correlation matrix and is positive semidefinite. Hence, its eigenvalues
are non-negative. Denoting the largest among these as $\lambda_{\max}$ and using the relation between the matrix
norm and the maximum eigenvalue [8], we have

$$(b - \bar{A}\bar{x})^T(\bar{A}\bar{A}^T)(b - \bar{A}\bar{x}) \le \lambda_{\max}\|b - \bar{A}\bar{x}\|_2^2 \tag{8}$$

Because of eqn (6), we can also write this as

$$\beta^2 k \le \lambda_{\max}\|b - \bar{A}\bar{x}\|_2^2 \tag{9}$$

As described previously, $\hat{x}$ (and hence $\bar{x}$) is also a solution of the LPn problem, so it satisfies the
inequality constraint of eqn (2). In fact, based on the results in the Appendix, this solution achieves equality:

$$\|b - \bar{A}\bar{x}\|_2^2 = \delta^2 \tag{10}$$

Hence, we have

$$\beta^2 k \le \lambda_{\max}\delta^2 \tag{11}$$

That is,

$$\beta \le \sqrt{\frac{\lambda_{\max}}{k}}\,\delta \tag{12}$$

Given $\delta$, this provides an upper bound on $\beta$. In practice, we could also use this upper bound as an
approximation to $\beta$ with

$$\beta \approx \sqrt{\frac{\lambda_{\max}}{k}}\,\delta \tag{13}$$

As demonstrated in Section 3, this approximation can often be quite good. Finally, it is interesting
to note that if we let $\delta \to 0^+$, the LPn problem becomes the LP problem. In this case, the upper
bound suggests that $\beta \to 0^+$, reproducing Fuchs's noiseless result for the relation between $\beta$ and $\delta$
(see Section 1).
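The upper bound can be evaluated numerically once the support columns are known. The sketch below is illustrative only: the function names `upper_bound_beta` and `kkt_gap` are hypothetical, and the code assumes that the reduced matrix (the $m \times k$ matrix of support columns) is passed in as `A_bar`. It uses the fact that the largest eigenvalue of the correlation matrix equals the squared spectral norm of the reduced matrix.

```python
import numpy as np

def upper_bound_beta(A_bar, delta):
    """Upper bound of eqn (12): sqrt(lambda_max / k) * delta.

    A_bar is assumed to be the m x k matrix of support columns.
    """
    k = A_bar.shape[1]
    # Largest eigenvalue of A_bar @ A_bar.T equals the squared spectral norm.
    lam_max = np.linalg.norm(A_bar, 2) ** 2
    return np.sqrt(lam_max / k) * delta

def kkt_gap(A_bar, residual, beta):
    """Return lam_max * ||r||^2 - beta^2 * k, which eqn (9) says is >= 0
    whenever the residual r = b - A_bar @ x_bar satisfies eqn (4)."""
    k = A_bar.shape[1]
    lam_max = np.linalg.norm(A_bar, 2) ** 2
    return lam_max * (residual @ residual) - beta ** 2 * k
```

For a residual that satisfies eqn (4), `kkt_gap` should never be negative; a negative value would indicate that the supplied support or sign pattern does not match the solution.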

Using the techniques for deriving the above upper bound, we can also derive a lower bound. However,
since $\bar{A}\bar{A}^T$ is positive semidefinite rather than positive definite, this is slightly less
straightforward and requires some approximations. Specifically, since $\bar{A}$ is an $m \times k$ matrix and
since generally $m > k$ (in practice, $m$ is usually on the order of $5k$ [1]), the rank of the matrix
$\bar{A}\bar{A}^T$ is at most $k$. Since $\bar{A}\bar{A}^T$ is an $m \times m$ matrix, it has at most $k$ non-zero
eigenvalues. Assume that it has exactly $k$ non-zero eigenvalues and denote the smallest of them as
$\lambda_{\min}$. Let the eigenvectors of $\bar{A}\bar{A}^T$ be $e_1, e_2, \ldots, e_m$, ordered according to the value of
their eigenvalues: $e_1$ for $\lambda_{\max}$, $e_k$ for $\lambda_{\min}$, and $e_{k+1}, \ldots, e_m$ for the zero eigenvalue.
Since $b - \bar{A}\bar{x}$ is an $m$-dimensional vector, it can be represented by $e_1, e_2, \ldots, e_m$, with

$$b - \bar{A}\bar{x} = \sum_{i=1}^{m} \alpha_i e_i \tag{14}$$

where $\alpha_i$ are representation coefficients. From this, we have

$$(b - \bar{A}\bar{x})^T(\bar{A}\bar{A}^T)(b - \bar{A}\bar{x}) = \Big(\sum_{i=1}^{m}\alpha_i e_i\Big)^T(\bar{A}\bar{A}^T)\Big(\sum_{j=1}^{m}\alpha_j e_j\Big) = \sum_{i=1}^{m}\lambda_i\alpha_i^2 = \sum_{i=1}^{k}\lambda_i\alpha_i^2 \ge \lambda_{\min}\sum_{i=1}^{k}\alpha_i^2 \tag{15}$$

Now, we find an estimate of $\sum_{i=1}^{k}\alpha_i^2$. First we notice that the larger sum satisfies

$$\sum_{i=1}^{m}\alpha_i^2 = \|b - \bar{A}\bar{x}\|_2^2 \tag{16}$$

Hence, from eqn (10) we have

$$\sum_{i=1}^{m}\alpha_i^2 = \delta^2 \tag{17}$$

Furthermore, we notice that $b - A\hat{x} = b - \bar{A}\bar{x} = w$ is the noise. Assuming the noise is white (i.e.,
uncorrelated), on average the $\alpha_i^2$ are roughly the same [12], at $\delta^2/m$. Hence, we have

$$\sum_{i=1}^{k}\alpha_i^2 \approx \frac{k}{m}\delta^2 \tag{18}$$

Now, combining eqns (6), (15), and (18), we have

$$\lambda_{\min}\frac{k}{m}\delta^2 \le \beta^2 k \tag{19}$$

That is,

$$\sqrt{\frac{\lambda_{\min}}{m}}\,\delta \le \beta \tag{20}$$

This provides a lower bound on $\beta$ for a given $\delta$.
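The whiteness approximation of eqn (18) and the resulting lower bound of eqn (20) can be checked numerically. The following sketch is an illustration under stated assumptions: the function names `leading_energy_fraction` and `lower_bound_beta` are hypothetical, the reduced matrix of support columns is assumed to have full column rank $k$, and the eigenvectors of the correlation matrix are obtained as the left singular vectors of the reduced matrix.

```python
import numpy as np

def leading_energy_fraction(A_bar, w):
    """Fraction of ||w||^2 carried by the k eigenvectors of A_bar @ A_bar.T
    with non-zero eigenvalues, i.e. sum_{i<=k} alpha_i^2 / sum_{i<=m} alpha_i^2.
    Eqn (18) approximates this ratio by k / m for white noise."""
    k = A_bar.shape[1]
    # Left singular vectors of A_bar are the eigenvectors of A_bar @ A_bar.T,
    # ordered by decreasing eigenvalue.
    U, _, _ = np.linalg.svd(A_bar, full_matrices=True)
    alpha = U.T @ w                       # coefficients alpha_i of eqn (14)
    return np.sum(alpha[:k] ** 2) / np.sum(alpha ** 2)

def lower_bound_beta(A_bar, delta):
    """Lower bound of eqn (20): sqrt(lambda_min / m) * delta, where lambda_min
    is the smallest non-zero eigenvalue of A_bar @ A_bar.T (assumes rank k)."""
    m = A_bar.shape[0]
    lam_min = np.linalg.svd(A_bar, compute_uv=False).min() ** 2
    return np.sqrt(lam_min / m) * delta
```

For a single noise realization the energy fraction fluctuates around $k/m$; only its average over many white-noise draws is expected to be close to $k/m$, which is why eqn (18) is an approximation rather than an identity.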

The only problem left now is to find the eigenvalues $\lambda_{\min}$ and $\lambda_{\max}$. In general, these eigenvalues
depend on the specifics of the $A$ matrix, such as which columns correspond to the non-zero
elements of $\hat{x}$. However, there is an important practical case where these eigenvalues
can be found relatively easily: the widely used case where $A$ is an i.i.d. Gaussian random
matrix. In this case, it has been shown in previous work [9], [1] that the smallest (non-zero) and the
largest eigenvalues of $\bar{A}\bar{A}^T$ are asymptotically

$$\lambda_{\min} \approx m\sigma^2(1 - \sqrt{\gamma})^2, \qquad \lambda_{\max} \approx m\sigma^2(1 + \sqrt{\gamma})^2 \tag{21}$$

where $m$ is the number of rows in $A$ and $\bar{A}$, $\sigma^2$ is the variance of each component of $A$ (and $\bar{A}$),
and $\gamma = k/m$ (recall that $k$ is the dimension of $\bar{x}$, and also the number of columns of $\bar{A}$). Plugging
these into the bounds of eqns (12) and (20), we then have

$$(1 - \sqrt{\gamma})\,\sigma\delta \le \beta \le \frac{1 + \sqrt{\gamma}}{\sqrt{\gamma}}\,\sigma\delta \tag{22}$$

When a compressed sensing application uses a Gaussian random sampling matrix, the inequality of
eqn (22) can be used to find the range of, or estimate, $\beta$ from a given $\delta$.
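As a worked illustration of eqn (22), the bounds can be computed directly from $m$, $k$, $\sigma$, and $\delta$. The helper name `gaussian_beta_bounds` below is hypothetical, and the sketch assumes the sparsity $k$ is known or estimated in advance.

```python
import math

def gaussian_beta_bounds(m, k, sigma, delta):
    """Lower and upper bounds on beta from eqn (22) for an i.i.d. Gaussian
    sampling matrix with component standard deviation sigma.

    m: number of measurements, k: number of non-zero components (assumed known),
    delta: square root of the noise bound delta^2.
    """
    gamma = k / m
    root = math.sqrt(gamma)
    lower = (1.0 - root) * sigma * delta
    upper = (1.0 + root) / root * sigma * delta
    return lower, upper
```

For instance, with $m = 100$, $k = 24$, $\sigma = 1$, and $\delta^2 = 2.25$ (so $\delta = 1.5$), `gaussian_beta_bounds(100, 24, 1.0, 1.5)` gives an interval of roughly 0.77 to 4.56 for $\beta$.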



2.2. An Equality Relation 

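If we add an additional assumption to the derivations in Section 2.1, we can obtain an equality
relation between $\delta$ and $\beta$. Specifically, if we assume that in eqn (4) the $k \times k$ matrix $\bar{A}^T\bar{A}$ is
invertible, as Fuchs did in his papers [2], [3], then eqn (4) becomes

$$\bar{x} = (\bar{A}^T\bar{A})^{-1}(\bar{A}^T b - \beta\,\mathrm{sgn}(\bar{x})) \tag{23}$$

and this provides a solution to the QP problem. As we mentioned previously, when $\beta$ matches $\delta$,
this solution is the same as that of the LPn. Furthermore, it can be shown that the solution of the
LPn must satisfy the inequality constraint with equality (see Appendix). Hence, the solution of (23) should
satisfy

$$\|\bar{A}\bar{x} - b\|_2^2 = \delta^2 \tag{24}$$

or

$$\bar{x}^T\bar{A}^T\bar{A}\bar{x} - 2b^T\bar{A}\bar{x} + \|b\|^2 = \delta^2 \tag{25}$$

where we have dropped the subscript 2.

Plugging the right-hand side of (23) into (25) and denoting $\mathrm{sgn}(\bar{x})$ as the vector $c$ to simplify notation,
the first term of (25) becomes

$$\begin{aligned}
\bar{x}^T\bar{A}^T\bar{A}\bar{x} &= [(\bar{A}^T\bar{A})^{-1}(\bar{A}^T b - \beta c)]^T\,\bar{A}^T\bar{A}\,[(\bar{A}^T\bar{A})^{-1}(\bar{A}^T b - \beta c)] \\
&= (\bar{A}^T b - \beta c)^T[(\bar{A}^T\bar{A})^{-1}(\bar{A}^T\bar{A})][(\bar{A}^T\bar{A})^{-1}(\bar{A}^T b - \beta c)] \\
&= [b^T\bar{A} - \beta c^T][(\bar{A}^T\bar{A})^{-1}(\bar{A}^T b - \beta c)] \\
&= b^T\bar{A}(\bar{A}^T\bar{A})^{-1}\bar{A}^T b + c^T(\bar{A}^T\bar{A})^{-1}c\,\beta^2 - 2b^T\bar{A}(\bar{A}^T\bar{A})^{-1}c\,\beta
\end{aligned} \tag{26}$$

where we have used the fact that $\bar{A}^T\bar{A}$ is symmetric and so is its inverse, i.e.,
$[(\bar{A}^T\bar{A})^{-1}]^T = (\bar{A}^T\bar{A})^{-1}$.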
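Similarly, for the second term of (25), we have

$$\begin{aligned}
-2b^T\bar{A}\bar{x} &= -2b^T\bar{A}[(\bar{A}^T\bar{A})^{-1}(\bar{A}^T b - \beta c)] \\
&= -2b^T\bar{A}(\bar{A}^T\bar{A})^{-1}\bar{A}^T b + 2b^T\bar{A}(\bar{A}^T\bar{A})^{-1}c\,\beta
\end{aligned} \tag{27}$$

Combining this and the result of (26) into (25), we have

$$\|\bar{A}\bar{x} - b\|^2 = -b^T\bar{A}(\bar{A}^T\bar{A})^{-1}\bar{A}^T b + c^T(\bar{A}^T\bar{A})^{-1}c\,\beta^2 + \|b\|^2 = \delta^2 \tag{28}$$

where the terms linear in $\beta$ in (26) and (27) have canceled each other. From this, we can find $\beta$ in
terms of $\delta$ as:

$$\beta = \sqrt{\frac{\delta^2 - \|b\|^2 + b^T\bar{A}(\bar{A}^T\bar{A})^{-1}\bar{A}^T b}{c^T(\bar{A}^T\bar{A})^{-1}c}} \tag{29}$$

Although this equality provides a more explicit relation between $\beta$ and $\delta^2$, in practice it is more
difficult to use than the inequalities of Section 2.1, since $c$ and $\bar{A}$ are generally not known before the QP
and LP are solved.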
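The equality relation can be sanity-checked numerically: build the reduced solution from eqn (23) for a chosen $\beta$, measure the resulting residual norm as $\delta$, and recover $\beta$ from eqn (29). The sketch below is illustrative; the function names `reduced_solution` and `beta_from_delta` are hypothetical, and the support matrix `A_bar` and sign vector `c` are assumed to be given, which, as noted above, is usually not the case in practice.

```python
import numpy as np

def reduced_solution(A_bar, b, c, beta):
    """Eqn (23): x_bar = (A_bar^T A_bar)^{-1} (A_bar^T b - beta * c)."""
    G = A_bar.T @ A_bar
    return np.linalg.solve(G, A_bar.T @ b - beta * c)

def beta_from_delta(A_bar, b, c, delta):
    """Eqn (29): recover beta from delta, given the support matrix A_bar
    and the sign vector c (both assumed known for this illustration)."""
    G = A_bar.T @ A_bar
    # b^T A_bar (A_bar^T A_bar)^{-1} A_bar^T b
    proj_energy = b @ A_bar @ np.linalg.solve(G, A_bar.T @ b)
    numerator = delta ** 2 - b @ b + proj_energy
    denominator = c @ np.linalg.solve(G, c)
    return np.sqrt(numerator / denominator)
```

A round trip through `reduced_solution` followed by `beta_from_delta`, with `delta` set to the norm of `A_bar @ x_bar - b`, should return the original $\beta$ up to floating-point error.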



3. Experimental Verification 

In this section, we provide some experimental (simulation) results that verify and illustrate the
inequality relations between $\delta$ and $\beta$ derived in Section 2.1. In each experiment, we picked an LPn
problem with a given noise level $\delta^2$ [see eqn (2)] and formed a corresponding QP problem [see eqn
(3)]. Then, the QP problem was solved for a range of $\beta$ values and, from the resulting solutions, we
tried to identify the best $\beta$ corresponding to the given $\delta$ since, according to the theory of duality, the best
$\beta$ should result in the same solution as that of the LPn with $\delta^2$. From this, we could see if, or how
well, the best $\beta$ satisfies the upper and lower bounds derived in Section 2.1. Next, we describe the
specific steps in our experiments.

3.1. Experiment Steps 

Each experiment consists of the following steps: 

1. Generate a sparse random n-dimensional signal x*. 

2. Generate a noisy observed signal $b = Ax^* + w$, where $A$ is an $m \times n$ Gaussian random sampling
matrix with $m < n$ and component variance $\sigma^2$, and $w$ is an $m$-dimensional additive white
Gaussian noise vector with variance $\sigma_w^2$.

3. Reconstruct $x^*$ by solving the LPn problem with $\delta^2$ set to $\delta^2 = m\sigma_w^2$. Denote the resulting
solution as $\hat{x}(\delta)$.

4. Reconstruct $x^*$ by solving the QP problem with a range of $\beta$ values. Denote the resulting solution
as $\hat{x}(\beta)$.

5. Compare the minimum obtained by LPn, i.e., $\|\hat{x}(\delta)\|_1$, and the maximum of the dual function
obtained by QP, $g(\beta)$ (more details later).

6. Compare the normalized reconstruction errors (i.e., $\|x^* - \hat{x}\|_2 / \|x^*\|_2$) obtained by LPn and QP
($\hat{x}$ could be either $\hat{x}(\delta)$ or $\hat{x}(\beta)$).

7. Find "the best $\beta$" (that produces the same QP solution as the LPn solution) and compare this
$\beta$ with the bounds in eqns (12) and (20).
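Steps 1 to 3 of the procedure above can be sketched in code as follows. This is an illustration only, not the code used for the reported experiments: the function name `make_problem`, its default parameter values, and the use of NumPy's random generator are assumptions made for the example, and the LPn and QP solvers of Steps 3 and 4 are not included.

```python
import numpy as np

def make_problem(n=256, m=100, k=24, sigma=1.0, sigma_w=0.15, seed=0):
    """Generate a synthetic compressed sensing problem (Steps 1 and 2) and
    the noise bound used in Step 3. All defaults are illustrative."""
    rng = np.random.default_rng(seed)
    # Step 1: sparse signal with k randomly placed +1/-1 entries.
    x_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_true[support] = rng.choice([-1.0, 1.0], size=k)
    # Step 2: Gaussian sampling matrix and additive white Gaussian noise.
    A = sigma * rng.standard_normal((m, n))
    w = sigma_w * rng.standard_normal(m)
    b = A @ x_true + w
    # Step 3: noise bound delta^2 = m * sigma_w^2, returned as delta.
    delta = np.sqrt(m) * sigma_w
    return A, b, x_true, delta
```

With the default `sigma_w = 0.15`, the noise variance is 0.0225 and the returned `delta` satisfies $\delta^2 = 100 \times 0.0225 = 2.25$.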

We now explain each of these steps in some detail. In Step 1, the original random sparse signal
$x^*$ was obtained from examples provided/generated by the L1 Magic software [10] (which uses these
to illustrate the workings of compressed sensing algorithms). Specifically, in our experiments $x^*$
is a sparse random vector of dimension $n = 256$ and its non-zero components consist of $k = 24$
randomly placed $+1$s and $-1$s (also randomly chosen), as shown in Fig. 2.

In Step 2, the noisy observed signal $b$ was generated using an $m \times n$ Gaussian random sampling
matrix $A$ with $m = 100$ and component variance $\sigma^2 = 1$; for the additive noise $w$, we used a white
Gaussian noise with variance $\sigma_w^2$ (whose value is different in different experiments, more details
later). Some typical noisy observed signals are also shown in Fig. 2.

In Step 3, the LPn problem was solved with a log barrier algorithm in L1 Magic, and in Step 4,
the QP problem was solved with the L1 Regularization software developed by Kim et al. [11].

In Step 5, what we are really doing is using results from duality theory [7] to find the best $\beta$.
Specifically, for the LPn problem, one can define a dual function

$$g(\lambda) = \inf_u\left\{\|u\|_1 + \lambda\left(\|Au - b\|_2^2 - \delta^2\right)\right\} \tag{30}$$

Because the LPn satisfies the strong duality condition (see [7]), we have

$$\|\hat{x}\|_1 \ge g(\lambda) \quad \text{for all } \lambda \ge 0 \tag{31}$$

where $\hat{x}$ is the solution of the LPn problem and equality is achieved at the best $\lambda$, denoted as $\lambda_0$,
with

$$\lambda_0 = \arg\max_\lambda g(\lambda) \quad \text{and} \quad \|\hat{x}\|_1 = g(\lambda_0) \tag{32}$$

Now, the dual function $g(\lambda)$ can be linked to the QP solution: we can re-write it as

$$g(\lambda) = \inf_u\left\{\|u\|_1 + \lambda\left(\|Au - b\|_2^2 - \delta^2\right)\right\} = 2\lambda\,\inf_u\left\{\tfrac{1}{2}\|Au - b\|_2^2 + \tfrac{1}{2\lambda}\|u\|_1\right\} - \lambda\delta^2 \tag{33}$$

where the $\inf_u\{\cdot\}$ is the QP solution with $1/(2\lambda) = \beta$. In this way, whenever a QP problem with
a given $\beta$ is solved, we can compute a corresponding $g(\lambda)$ with $\lambda = 1/(2\beta)$. In this sense, we can write
(re-parameterize) $g(\lambda)$ as $g(\beta)$, and the strong duality of eqn (32) can be re-written in terms of $\beta$ as

$$\beta_0 = \arg\max_\beta g(\beta) \quad \text{and} \quad \|\hat{x}\|_1 = g(\beta_0) \tag{34}$$

where $\beta_0$ is the best $\beta$ (which makes the QP have the same solution as the LPn).
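The re-parameterized dual function of eqn (33) can be evaluated from any QP solution. The sketch below is illustrative: the function name `dual_value` is hypothetical, and the argument `u_beta` is assumed to be an (approximate) minimizer of the QP objective for the given $\beta$; if it is not, the returned number is not the true dual value.

```python
import numpy as np

def dual_value(A, b, u_beta, beta, delta):
    """Evaluate g(beta) via eqn (33) with lambda = 1 / (2 * beta).

    u_beta is assumed to minimize 0.5 * ||A u - b||_2^2 + beta * ||u||_1.
    """
    lam = 1.0 / (2.0 * beta)
    residual = A @ u_beta - b
    qp_min = 0.5 * (residual @ residual) + beta * np.sum(np.abs(u_beta))
    return 2.0 * lam * qp_min - lam * delta ** 2
```

Algebraically, the returned value equals $\|u_\beta\|_1 + (\|Au_\beta - b\|_2^2 - \delta^2)/(2\beta)$, so it coincides with the $\ell_1$ norm of the QP solution exactly when the residual energy equals $\delta^2$, which is the situation described by eqn (34).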

Finally, Steps 6 and 7 are relatively straightforward and we will discuss our experimental results 
next. 
3.2. Experimental Results

Some typical experimental results are shown in Figs. 3-14. In Fig. 3, the observation noise variance
is $\sigma_w^2 = 0.0225$, corresponding to an SNR of 30dB (low noise) and a noise bound of $\delta^2 = m\sigma_w^2 =
100 \times 0.0225 = 2.25$. Fig. 3 contains information obtained in Step 5, i.e., the minimum achieved
by the LPn, the re-parameterized dual function $g(\beta)$, and the upper and lower bounds for $\beta$
we derived in Section 2.1. Note that since the minimum achieved by the LPn is a number (constant)
while the dual function is a function of $\beta$, we present the former as a horizontal line. Similarly, the
bounds are numbers, i.e., specific values of $\beta$, and hence are presented as vertical lines.

From the results of Fig. 3, we can make two observations. First, the experimental results agree
with the prediction of the duality theory. That is, the $g(\beta)$ curve is always below the $\|\hat{x}\|_1$ line and, for
the "best" $\beta$, the curve approaches the line. Second, the best $\beta$ falls in the interval defined
by our upper and lower bounds (almost right in the middle of the interval). This is very encouraging.

Fig. 4 compares the normalized reconstruction errors (see Step 6) for the LPn and QP. The
former is a number, presented as a horizontal line, while the latter is a function of $\beta$ and hence is a curve.
At the "best" $\beta$ [i.e., when eqn (34) is satisfied or, equivalently, when the curve meets the straight line in Fig.
3], the QP and LP have the same reconstruction errors. Hence, looking at the reconstruction errors of
QP and LP provides another potential way¹ of identifying the "best" $\beta$, as can be seen in Fig. 4:
for the "best" $\beta$, the QP has the same reconstruction error as that of LP. From Fig. 4, we can also
observe that $\beta$ can have a strong effect on reconstruction error. Finally, we note that the best $\beta$
does not necessarily lead to the minimum reconstruction error for QP, since the best $\beta$ is best in the
sense of duality theory (providing the same solution as that of LPn), not in the sense of minimum
reconstruction error. Currently, it is not obvious how to find the best $\beta$ in this latter sense.

To ensure that our results in Figs. 3 and 4 are not accidental, we repeated that experiment 100
times (each time with a new random sparse signal, random sampling matrix, and additive noise
vector) and averaged the results. These are shown in Figs. 5 and 6. As can be seen, the nature of
the results is the same as that of Figs. 3 and 4.

Finally, we repeated the experiments of Figs. 3-6 for a higher noise level, with $\sigma_w^2 = 0.2025$,
corresponding to an SNR of 20dB (heavy noise) and a noise level of $\delta^2 = 100 \times 0.2025 = 20.25$. The
results are presented in Figs. 7-10. The nature of the results is the same as that of Figs. 3-6: our
derived bounds worked well. Furthermore, compared with Figs. 3-6, we can see that as the noise
decreases (from 20dB to 30dB SNR), the bounds and the best $\beta$ move to the left, agreeing with the
theoretical prediction that the best $\beta \to 0^+$ when the noise level decreases to 0.

Appendix 

Consider the solution to the $\ell_1$ minimization problem

$$(P_1): \quad \min_u \|u\|_1 \quad \text{subject to} \quad \|Au - b\|_2 \le \delta.$$

First we note that if $\|b\|_2 \le \delta$, then $\hat{x} = 0$ is the solution to $(P_1)$. To avoid this trivial case, we
assume $\|b\|_2 > \delta$.

The following result is an instance of a well-known result in convex optimization, known as the
maximum principle. We include it here for the reader's convenience; in addition, our proof is
specialized to the $(P_1)$ problem.

¹Sometimes, the normalized QP error curve can intersect the LP error line at more than one $\beta$; in this case, we
need to rely on the dual function to identify the best $\beta$.
Maximum Principle: Let $\hat{x}$ be a minimizer of $(P_1)$. If $\hat{x} \ne 0$, then

$$\|A\hat{x} - b\|_2 = \delta.$$

Proof: Since $\hat{x} \ne 0$, we may assume $\hat{x}(i_0) \ne 0$ for some $i_0$ ($\hat{x}(i)$ is the $i$th component of $\hat{x}$).
Suppose $\hat{x}$ is not on the boundary; then

$$d = \delta - \|A\hat{x} - b\|_2 > 0.$$

Choose a small $t$ such that $|\hat{x}(i_0) - t| < |\hat{x}(i_0)|$ and $\|A(\hat{x} - x')\|_2 < d$ (here $x'(i) = \hat{x}(i)$ for $i \ne i_0$
and $x'(i_0) = \hat{x}(i_0) - t$). By the triangle inequality, $\|Ax' - b\|_2 \le \|A\hat{x} - b\|_2 + \|A(\hat{x} - x')\|_2 < \delta$, so
$x'$ is feasible, while $\|x'\|_1 < \|\hat{x}\|_1$. This contradicts the minimality of $\hat{x}$ and proves
the maximum principle.
Figure 1: QP's Reconstruction Results as a Function of $\beta$. Curve: QP's reconstruction error as a
function of $\beta$; straight line: LP's reconstruction error. $n = 100$, $k = 10$, $m = 50$, $\delta^2 = 0.75$ (SNR =
37dB).

Figure 2: A Sparse Random Signal and its Noisy Observation. Top: the sparse random signal;
bottom: the noisy observed signal with an SNR of 20dB.

Figure 3: The Minimum from LP and the Dual Function $g(\beta)$ from QP: the 30dB SNR Case. Straight
line: the minimum from LP; curve: the dual function from QP.

Figure 4: Recovery Error from LP and QP: the 30dB SNR Case. The straight line: LP; the curve:
QP.

Figure 5: Experiment in Figs. 3 and 4 Repeated 100 Times: Averaged Minimum from LP and
Averaged Dual Function from QP.

Figure 6: Experiment in Figs. 3 and 4 Repeated 100 Times: Averaged Reconstruction Errors from
LP and QP.

Figure 7: The Minimum from LP and the Dual Function $g(\beta)$ from QP: the 20dB SNR Case.

Figure 8: Recovery Error from LP and QP: the 20dB SNR Case. The straight line: LP; the curve:
QP.

Figure 9: Experiment in Figs. 7 and 8 Repeated 100 Times: Averaged Minimum from LP and
Averaged Dual Function from QP.

Figure 10: Experiment in Figs. 7 and 8 Repeated 100 Times: Averaged Reconstruction Errors from
LP and QP.